Take-home Exercise 1: Geospatial Analytics for Public Good

Author

Ng Meng Ye

Published

September 3, 2024

Modified

September 3, 2024

1 Overview

Thailand’s roads are the deadliest in Southeast Asia and among the worst in the world, according to the World Health Organisation. About 20,000 people die in road accidents each year, or about 56 deaths a day (WHO).

2 Getting Started

2.1 Objectives

In view of this, we need discover factors affecting road traffic accidents in the Bangkok Metropolitan Region BMR by employing both spatial spatio-temporal point patterns analysis methods.

The specific objectives of this take-home exercise are as follows:

  • To visualize the spatio-temporal dynamics of road traffic accidents in BMR using appropriate statistical graphics and geovisualization methods.

  • To conduct detailed spatial analysis of road traffic accidents using appropriate Network Spatial Point Patterns Analysis methods.

  • To conduct detailed spatio-temporal analysis of road traffic accidents using appropriate Temporal Network Spatial Point Patterns Analysis methods.

2.2 The Study Area

The focus of this study would in the Bangkok Metropolitan Region BMR, which includes the provinces:

  1. Bangkok

  2. Nonthaburi

  3. Nakhon Pathom

  4. Pathum Thani

  5. Samut Prakan

  6. Samut Sakhon

3 Data Preparation

3.1 Geospatial

These data sets are in shp format

  • Thailand Roads, available publicly from HDX
  • Thailand - Subnational Administrative Boundaries, available publicly from HDX

This data sets are in csv format

  • Thailand Road Accident [2019-2022], available publicly from Kaggle
set.seed(1234)
pacman::p_load(sf,raster,spatstat,tmap,tidyverse,sp,maptools)

4 Data Wrangling

Thailand Road Accident 2019-2022

Importing Attribute Data into R

We will import thai_road_accident_2019_2022.csv file into RStudio and save the file into an R dataframe called rdacc

rdacc <- read_csv("C:/ngmengye/ISSS626-GAA/Take-home_Ex/Take-home_Ex01/data/aspatial/thai_road_accident_2019_2022.csv")
Rows: 81735 Columns: 18
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (10): province_th, province_en, agency, route, vehicle_type, presumed_c...
dbl   (6): acc_code, number_of_vehicles_involved, number_of_fatalities, numb...
dttm  (2): incident_datetime, report_datetime

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Filtering missing values

To define the geometry of each point, we need to use the latitude and longitude coordinates. Before converting the data frame to an sf object, we need to ensure that there are no missing values in the latitude and longitude columns.

Since only 359 out of 81,735 rows have missing values, it is reasonable to remove them.

lat_na <- sum(is.na(rdacc$latitude))
long_na <- sum(is.na(rdacc$longitude))

total_rows <- nrow(rdacc)

lat_na_pct <- (lat_na / total_rows) * 100
long_na_pct <- (long_na / total_rows) * 100

cat("Missing values in Latitude:", lat_na, "(", round(lat_na_pct, 2), "% )\n")
Missing values in Latitude: 359 ( 0.44 % )
cat("Missing values in Longitude:", long_na, "(", round(long_na_pct, 2), "% )\n")
Missing values in Longitude: 359 ( 0.44 % )

Creating a simple feature data frame from an aspatial data frame

The code chunk below converts rdacc data frame into a simple feature data frame by using st_as_st() of sf packages

# filter out NA or missing data of longitude and latitude
# check how many missing values (make sure less than 25%)
rdacc_sf <- rdacc %>% 
  filter(!is.na(longitude) & longitude != "",
         !is.na(latitude)& latitude != "") %>% 
  st_as_sf(coords = c(
    "longitude", "latitude"),
    crs = 4326) %>% 
  st_transform(crs = 32647)
Correcting the projection

EPSG: 4326 is wgs84 Geographic Coordinate System and EPSG: 32647 refers to the WGS 84 / UTM zone 47N Projected Coordinate System, which is specifically used for areas in Thailand.

Plotting the Aspatial Data

We use st_geometry to display basic information of the feature class such as type of geometry. It looks like the plot is showing points scattered across a wide area, which suggests that your data includes locations outside the Bangkok Metropolitan Region (BMR).

plot(st_geometry(rdacc_sf))

To focus only on data within the BMR, we need to filter the dataset for coordinates that fall within the region.

rdacc_bmr_sf <- rdacc_sf %>%
  filter(province_en %in% c("Bangkok", "Nonthaburi", "Nakhon Pathom", "Pathum Thani", "Samut Prakan", "Samut Sakhon"))

Our plot now shows road accident points within the Bangkok Metropolitan Region (BMR) overlaid on the road network. This looks much better than the previous scattered points across a broader region.

plot(st_geometry(rdacc_bmr_sf))

Thailand Roads

Importing Geospatial Data

We will import hotosm_tha_roads_lines shapefile into RStudio as sf data frames.

network <- st_read(dsn="C:/ngmengye/ISSS626-GAA/Take-home_Ex/Take-home_Ex01/data/geospatial", 
                   layer="hotosm_tha_roads_lines_shp")
Reading layer `hotosm_tha_roads_lines_shp' from data source 
  `C:\ngmengye\ISSS626-GAA\Take-home_Ex\Take-home_Ex01\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 2792590 features and 14 fields
Geometry type: MULTILINESTRING
Dimension:     XY
Bounding box:  xmin: 97.34457 ymin: 5.643645 xmax: 105.6528 ymax: 20.47168
CRS:           NA
network <- st_set_crs(network, 4326)
network32647 <- st_transform(network,32647)
network32647
Simple feature collection with 2792590 features and 14 fields
Geometry type: MULTILINESTRING
Dimension:     XY
Bounding box:  xmin: 325313.7 ymin: 624248.4 xmax: 1215576 ymax: 2263968
Projected CRS: WGS 84 / UTM zone 47N
First 10 features:
             name               name_en        highway  surface smoothness
1      ถนนฉลองกรุง    Chalong Krung Road      secondary    paved       <NA>
2  ซอยฉลองกรุง 1/1 Soi Chalong Krung 1/1    residential     <NA>       <NA>
3            <NA>                  <NA> secondary_link     <NA>       <NA>
4            <NA>                  <NA>        service     <NA>       <NA>
5      ถนนฉลองกรุง    Chalong Krung Road      secondary concrete       <NA>
6            <NA>                  <NA>        service     <NA>       <NA>
7     ถนนเอราวัณ 1         Erawan 1 Road       tertiary     <NA>       <NA>
8            <NA>                  <NA>           path  unpaved       <NA>
9            <NA>                  <NA>        service     <NA>       <NA>
10           <NA>                  <NA>    residential     <NA>       <NA>
   width lanes oneway bridge layer source        name_th     osm_id  osm_type
1   <NA>  <NA>    yes   <NA>  <NA>   <NA>     ถนนฉลองกรุง 1125681229 ways_line
2   <NA>  <NA>   <NA>   <NA>  <NA>   <NA> ซอยฉลองกรุง 1/1  594401607 ways_line
3   <NA>  <NA>    yes   <NA>  <NA>   <NA>           <NA>  472283206 ways_line
4   <NA>  <NA>   <NA>   <NA>  <NA>   <NA>           <NA>  594401608 ways_line
5   <NA>     2    yes    yes     1   Bing     ถนนฉลองกรุง  116847248 ways_line
6   <NA>  <NA>   <NA>   <NA>  <NA>   <NA>           <NA>  317485095 ways_line
7   <NA>  <NA>   <NA>   <NA>  <NA>   <NA>    ถนนเอราวัณ 1  378672881 ways_line
8   <NA>  <NA>   <NA>   <NA>  <NA>    GPS           <NA> 1238351123 ways_line
9   <NA>  <NA>   <NA>   <NA>  <NA>   <NA>           <NA>  909942692 ways_line
10  <NA>  <NA>   <NA>   <NA>  <NA>   <NA>           <NA>  694824299 ways_line
                         geometry
1  MULTILINESTRING ((693686.1 ...
2  MULTILINESTRING ((693358 15...
3  MULTILINESTRING ((692949.1 ...
4  MULTILINESTRING ((693256 15...
5  MULTILINESTRING ((692810.8 ...
6  MULTILINESTRING ((693877.2 ...
7  MULTILINESTRING ((677182.3 ...
8  MULTILINESTRING ((486572.6 ...
9  MULTILINESTRING ((629009.2 ...
10 MULTILINESTRING ((629703.9 ...

Thailand - Subnational Administrative Boundaries

Importing Geospatial Data

Thailand administrative level 0 (country), 1 (province), 2 (district), and 3 (sub-district, tambon) boundaries. Since we focus on Bangkok Metropolitan Region, we should import level 1 (province) data. We will import tha_admbnda_adm1_rtsd_20220121 shapefile into RStudio as sf data frames.

thai_map <- st_read(dsn="C:/ngmengye/ISSS626-GAA/Take-home_Ex/Take-home_Ex01/data/geospatial", 
                   layer="tha_admbnda_adm3_rtsd_20220121")
Reading layer `tha_admbnda_adm3_rtsd_20220121' from data source 
  `C:\ngmengye\ISSS626-GAA\Take-home_Ex\Take-home_Ex01\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 7425 features and 22 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 97.34336 ymin: 5.613038 xmax: 105.637 ymax: 20.46507
Geodetic CRS:  WGS 84
thai_map_32647 <- st_transform(thai_map, crs = 32647)
st_crs(thai_map_32647)
Coordinate Reference System:
  User input: EPSG:32647 
  wkt:
PROJCRS["WGS 84 / UTM zone 47N",
    BASEGEOGCRS["WGS 84",
        ENSEMBLE["World Geodetic System 1984 ensemble",
            MEMBER["World Geodetic System 1984 (Transit)"],
            MEMBER["World Geodetic System 1984 (G730)"],
            MEMBER["World Geodetic System 1984 (G873)"],
            MEMBER["World Geodetic System 1984 (G1150)"],
            MEMBER["World Geodetic System 1984 (G1674)"],
            MEMBER["World Geodetic System 1984 (G1762)"],
            MEMBER["World Geodetic System 1984 (G2139)"],
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[2.0]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["UTM zone 47N",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",99,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Navigation and medium accuracy spatial referencing."],
        AREA["Between 96°E and 102°E, northern hemisphere between equator and 84°N, onshore and offshore. China. Indonesia. Laos. Malaysia - West Malaysia. Mongolia. Myanmar (Burma). Russian Federation. Thailand."],
        BBOX[0,96,84,102]],
    ID["EPSG",32647]]
thaiBMR <- thai_map_32647 %>%
  filter(ADM1_EN %in% c("Bangkok", "Nonthaburi", "Nakhon Pathom", "Pathum Thani", "Samut Prakan", "Samut Sakhon"))
thaiBMR
Simple feature collection with 477 features and 22 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 587893.5 ymin: 1484414 xmax: 712440.5 ymax: 1579076
Projected CRS: WGS 84 / UTM zone 47N
First 10 features:
   Shape_Leng   Shape_Area                  ADM3_EN        ADM3_TH ADM3_PCODE
1  0.04769920 1.284175e-04 Phraborom Maharatchawang พระบรมมหาราชวัง   TH100101
2  0.03355050 6.068479e-05      Wang Burapha Phirom     วังบูรพาภิรมย์   TH100102
3  0.01728931 1.769761e-05         Wat Ratchabophit       วัดราชบพิธ   TH100103
4  0.01904576 1.920433e-05               Samran Rat     สำราญราษฎร์   TH100104
5  0.01523190 1.257312e-05         San Chaopho Suea    ศาลเจ้าพ่อเสือ   TH100105
6  0.01456981 1.292603e-05             Sao Chingcha        เสาชิงช้า   TH100106
7  0.03114989 4.079321e-05              Bowon Niwet        บวรนิเวศ   TH100107
8  0.01821968 1.566010e-05                Talat Yot        ตลาดยอด   TH100108
9  0.02230278 2.826988e-05          Chana Songkhram      ชนะสงคราม   TH100109
10 0.02904022 3.448890e-05            Ban Phan Thom       บ้านพานถม   TH100110
   ADM3_REF ADM3ALT1EN ADM3ALT2EN ADM3ALT1TH ADM3ALT2TH     ADM2_EN ADM2_TH
1      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
2      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
3      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
4      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
5      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
6      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
7      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
8      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
9      <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
10     <NA>       <NA>       <NA>       <NA>       <NA> Phra Nakhon  พระนคร
   ADM2_PCODE ADM1_EN      ADM1_TH ADM1_PCODE  ADM0_EN   ADM0_TH ADM0_PCODE
1      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
2      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
3      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
4      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
5      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
6      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
7      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
8      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
9      TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
10     TH1001 Bangkok กรุงเทพมหานคร       TH10 Thailand ประเทศไทย         TH
         date    validOn    validTo                       geometry
1  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((661579 1521...
2  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662319.1 15...
3  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662329.2 15...
4  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662773.1 15...
5  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662036.7 15...
6  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662392.7 15...
7  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662789.3 15...
8  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662184.9 15...
9  2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662015.7 15...
10 2019-02-18 2022-01-22 -001-11-30 MULTIPOLYGON (((662999.2 15...
plot(st_geometry(thaiBMR))
plot(rdacc_bmr_sf$geometry,add=T,col='red',pch = 19)

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(thaiBMR) + tm_polygons()+
  tm_shape(rdacc_bmr_sf$geometry) + 
  tm_dots()
#BMR_network <- st_intersection(network32647,thaiBMR)
#plot(st_geometry(BMR_network))
#tmap_mode('view')
#tm_shape(BMR_network) + tm_lines()
#tmap_mode('plot')